Design and Analysis of Repeated Surveys
Abstract
This lecture will review the major issues associated with the design and analysis of repeated surveys. The interaction between the design of a repeated survey and the methods used for estimation and analysis will be examined. The choice of rotation pattern will be considered in terms of its impact on the estimation of levels and changes. Composite and other forms of estimators will be reviewed and the interaction between design and estimation explored. The production of seasonally adjusted and trend estimates from repeated surveys will also be considered.
I. Overview of Issues in Repeated Surveys
Sampling over time enables analysis of social and economic processes through estimation and analysis of changes in variables of interest. In addition to the usual design issues we need to consider the frequency of sampling, the spread and pattern of inclusion of units over time, the use of overlapping or non-overlapping samples over time and the precise pattern of overlap. Examples of types of surveys using sampling in time include: repeated, panel and longitudinal surveys; rotating panel surveys; split panel surveys; and rolling samples. Factors affecting the design of a sample over time include the key estimates to be produced, the type and level of analyses to be carried out, cost, data quality and reporting load. The interaction between sampling over time and features of the design, such as stratification and cluster sampling, also needs to be decided.
Duncan and Kalton (1987), Kalton and Citro (1993) and Steel (2004) give a general review of issues in the design and analysis of repeated surveys. Kasprzyk et al. (1989) cover many of the important issues associated with panel surveys. Smith (1978), Binder and Hidiroglou (1988) and Fuller (1990) review estimation issues for repeated surveys. Time series may be produced from repeated surveys. The analysis of these time series may involve seasonal adjustment and trend estimation.
High quality surveys are based on probability sampling methods that provide estimates of characteristics of a population and permit analysis of relationships between variables. A probability sampling method ensures that all population members have a known, non-zero probability of selection. Common methods are simple random sampling, probability proportional to size selection, stratification, and cluster and multi-stage sampling. Standard errors are estimated and used to make inferences by constructing confidence intervals. Analytical outputs such as regression coefficients can also be produced, enabling relationships between variables to be analyzed. Samples of people are often obtained by selecting a sample of dwellings and including the households and people in the selected dwellings. Sampling is also used for surveys of other entities such as hospitals, schools and businesses. Sampling of physical units, such as areas of land, can also be used.
The frequency of sampling depends on the purpose of the survey. A repeated survey enables estimation of changes for the population as well as cross-sectional estimates. Monitoring and detecting important changes will usually be a key reason for sampling in time. Common frequencies for repeated surveys are monthly, quarterly and annual. More frequent sampling may be adopted, for example in opinion polls leading up to an election or in monitoring TV ratings. Examples of monthly repeated surveys are the labour force surveys in Australia, the US, Canada and Japan, and the Retail Trade Survey in Australia. Quarterly surveys include the Labour Force Surveys in the UK and Ireland and many business surveys. Factors in deciding the frequency of sampling are how quickly changes are likely to occur, how quickly decisions are needed and the budget available. Sampling should not take place so often that the sample is registering unimportant short-term movements of no practical interest.
In a stratified, multi-stage design the sample should include each stratum and Primary Sampling Unit (PSU) in each period, although cost considerations may lead to each PSU being included in only one month or time period.
The reference period is a fundamental part of the definition of the variables of interest. A long reference period may increase the number of episodes or incidents included in the survey, but the impact of telescoping has to be considered. Other recall errors, which depend on the specific variables, must also be taken into account. For some variables a twelve-month reference period might be feasible, whereas for other variables a one-day reference period might be appropriate. Some variables, for example an opinion, are defined at the time of interview.
The population frame must be updated to incorporate changes in the population as quickly as possible. The sample should be updated to give new units a chance of selection and to remove defunct units that may affect sampling errors. Systems to update the sampling frame and sample need to be developed. In household surveys this can be done through a master sampling frame, which is updated regularly to add new housing and reflect other changes to the population. A rotation pattern can be implemented by dividing the frame into rotation groups. Use of a master sampling frame means that we can control overlap between different surveys using the same frame. For surveys of businesses the list or register has to be maintained and the sample updated. Rotation and overlap between the samples of different surveys can be controlled using permanent random number sampling.
In a panel survey an initial sample is selected and interviewed on several occasions. A panel survey can provide estimates of change for variables for which information is collected at each occasion and can provide estimates for different variables over time. Cost savings arise because costs are higher the first time a unit is included than on later occasions.
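Returning to the control of rotation and overlap through permanent random number (PRN) sampling mentioned above, the idea can be illustrated with a minimal sketch. It assumes one common form of the technique: each frame unit keeps a fixed uniform random number, the sample is the set of units whose number falls in a selection interval, and rotation is achieved by shifting the start of that interval. The function names, the frame of 1000 units, the 10% sampling fraction and the shift of 0.02 are all hypothetical choices for illustration, not details taken from the lecture.

```python
import random


def assign_prns(unit_ids, seed=0):
    """Give every frame unit a permanent uniform(0, 1) random number."""
    rng = random.Random(seed)
    return {u: rng.random() for u in unit_ids}


def select_sample(prns, start, fraction):
    """Select units whose PRN lies in [start, start + fraction), wrapping at 1."""
    end = start + fraction
    if end <= 1.0:
        return {u for u, r in prns.items() if start <= r < end}
    # The interval runs past 1, so it wraps around to the start of (0, 1).
    return {u for u, r in prns.items() if r >= start or r < end - 1.0}


# Hypothetical frame of 1000 units with an assumed 10% sampling fraction.
prns = assign_prns(range(1000))
period_1 = select_sample(prns, start=0.00, fraction=0.10)
period_2 = select_sample(prns, start=0.02, fraction=0.10)  # interval shifted by 0.02

# Share of the period 1 sample that is retained in period 2.
overlap = len(period_1 & period_2) / max(len(period_1), 1)
print(len(period_1), len(period_2), round(overlap, 2))
```

Because the random numbers are permanent, a second survey drawing from the same frame could be given a disjoint or partly shared interval, which is one way the overlap between surveys might be controlled.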
A distinction can be made between a repeated survey and a longitudinal survey. Binder (1998) provides a review of longitudinal surveys. In a longitudinal survey an initial sample is selected and at each occasion, or wave, an attempt is made to include all the members of the initial sample. A longitudinal survey permits analysis of changes at a micro level. Examples of longitudinal surveys include: the British Household Panel Survey (UK); the Survey of Family, Income and Employment (New Zealand); the National Longitudinal Surveys (US); the Households, Income and Labour Dynamics in Australia Survey (Australia); and the Survey of Income and Program Participation (US). A longitudinal survey is a form of panel survey, which is designed for analysis of changes at the unit level. In a repeated survey there is not necessarily any overlap of the sample for different occasions. A rotating panel survey also uses a sample that is followed over time, but the focus is on estimates at aggregate levels.
When the emphasis is on estimates for the population, an independent sample may be used on each occasion, which is often the case when the interval between the surveys is quite large. An option is to use the same sample at each occasion, with additions so that the sample estimates refer to the current population. For monthly or quarterly surveys the sample is often designed with considerable overlap between successive surveys. The sample overlap will reduce the sampling variance of estimates of change and reduce costs.
Many important surveys are conducted repeatedly to give estimates of the level or mean for several time periods. When one of the objectives is to estimate changes over time a number of issues arise. Estimates of change between time periods may be as important as, or more important than, estimates of the levels or means. There may be interest in the change in the level of a variable between two adjacent time periods and in changes between periods s time periods apart.
The time periods are usually months or quarters, but may be days or years. The question arises of whether we should use the same sample at each time period, independent samples, or partially overlapping samples. Cost and the standard errors of estimates of movements are usually minimized by having complete overlap of the samples. Respondent load, attrition, conditioning and declining response rates usually lead to some degree of replacement or rotation of the sample from one period to the next. The main interest may be in the time series, which leads to issues of seasonal adjustment and trend estimation.
A longitudinal survey can be used to provide estimates of changes at aggregate levels, but these estimates refer to the population at the time of the initial sample selection unless the sample has been updated to make it representative of the current population. The main purpose of a longitudinal survey is to enable estimates of changes at the unit level.
In a rotating panel survey the panel aspect is often implemented at the dwelling level, which implies that people and households are not followed when they leave a selected dwelling. People and households moving into a selected dwelling are included in the survey. This approach is suitable when the main objective is to provide unbiased aggregate estimates. In a rotating panel survey the focus is on aggregate estimates of change. However, any overlapping sample can also be used to analyze change at the micro level. A table can be produced from the matched sample showing the change of a variable between two time periods. An example is a table of change in status, which is referred to as a Gross Flows table. It is possible to create longitudinal data from rotating panel surveys. The length of the total time period and the time interval between observations are determined by the rotation pattern used.
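The construction of a Gross Flows table from the matched sample can be sketched as follows. This is a minimal illustration under stated assumptions: the unit identifiers and status labels are hypothetical, only units present in both periods are counted, and survey weights are ignored, whereas a production table would normally be weighted.

```python
from collections import Counter


def gross_flows(status_prev, status_curr):
    """Cross-tabulate status at two periods for units in both samples.

    Both arguments are dicts mapping a unit identifier to a status label.
    Only matched units (present in both periods) contribute to the table.
    """
    matched = status_prev.keys() & status_curr.keys()
    return Counter((status_prev[u], status_curr[u]) for u in matched)


# Hypothetical labour force statuses; identifiers and labels are illustrative.
prev = {1: "employed", 2: "employed", 3: "unemployed", 4: "not in LF"}
curr = {2: "employed", 3: "employed", 4: "not in LF", 5: "unemployed"}

flows = gross_flows(prev, curr)
for (origin, destination), count in sorted(flows.items()):
    print(f"{origin} -> {destination}: {count}")
```

Unit 1 rotates out and unit 5 rotates in, so neither contributes to the table, which shows how rotation limits the matched sample available for this kind of analysis.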
The resulting sample of individuals for which longitudinal data are available will be biased away from people who move permanently or are temporarily absent.
An alternative to a rotating panel survey is a split panel survey, which involves a panel survey supplemented on each occasion by an independent sample. This approach permits longitudinal analysis from the panel survey for more periods than would be possible in a rotating panel design, while cross-sectional estimates can still be obtained from the entire sample.
In general the three dimensions of space, time and variables need to be considered (Kish, 1998). A survey may be conducted continuously but the sample size in any time period may not be sufficient to provide reliable estimates for that period, at least for sub-national estimates. By cumulating samples over several time periods reliable estimates may be produced, and in this approach sample overlap is detrimental. The sample design can be developed so that it is a rolling sample with non-overlapping samples that over time cover many areas and eventually all areas. This approach can be useful in producing sub-national and small area estimates. A related approach is rolling estimates. For example, in the UK Labour Force Survey a non-overlapping sample is interviewed in each week of the quarter. Each month, estimates based on an average of the latest 13 weeks are produced (Caplan et al., 1999).
In section II some relevant theory is reviewed. Correlation modelling is considered in section III and rotation patterns and their impact are considered in section IV. Section V considers composite estimation and time series methods are mentioned in section VI.
II. Some Basic Theory
Repeated surveys can provide estimates for each time period, $y_t$. A major value of repeated surveys is in their ability to provide estimates of change. The simplest analysis of change is the estimate of one-period change, $y_t - y_{t-1}$. In a monthly survey this corresponds to one-month change.
For a survey conducted annually this corresponds to annual change. In general the change $s$ time periods apart can be estimated, using $\Delta^{(s)} y_t = y_t - y_{t-s}$. The focus is often on $s = 1$, but for a survey repeated on a monthly basis changes for $s = 2, 3, 12$ are also commonly examined. Having sample overlap at lag $s$ will usually lead to a positive correlation between the estimates. Since
$$\mathrm{Var}(\Delta^{(s)} y_t) = \mathrm{Var}(y_t) + \mathrm{Var}(y_{t-s}) - 2\sqrt{\mathrm{Var}(y_t)\,\mathrm{Var}(y_{t-s})}\;\mathrm{Corr}(y_t, y_{t-s}) \qquad (1)$$
having sample overlap reduces the variance of $\Delta^{(s)} y_t$ compared with having no sample overlap. Holt and Skinner (1989) consider the components of change in a repeated survey. A positive correlation between the estimates will reduce the variance and can often be achieved through sample overlap. If comparisons are made with time periods for which there are no sample units in common then the variance of the estimate of change will be the sum of the variances, which will often be approximately twice the variance of the estimate of the level for a particular time period.
These considerations result in designing the sampling so that there is overlap between the samples for time periods between which the movements are of major interest. So, if there is strong interest in monthly movement then there should be high sample overlap between successive months. If there is also interest in changes 12 months apart then consideration should be given to designs that induce sample overlap at this lag. However, for many variables the correlation at the unit level 12 months apart may not be high enough for there to be appreciable gains from doing so. If the samples are independent between the two time periods, then $\mathrm{Corr}(y_t, y_{t-s}) = 0$. In general there will be overlap between the samples and we would expect the degree of overlap to be a factor influencing correlation.
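The effect of correlation on the variance of an estimate of change in equation (1) can be checked numerically. The sketch below is illustrative only: the function name is hypothetical, and the level variances of 1.0 and the correlation of 0.8 are assumed values, not figures from any survey.

```python
import math


def variance_of_change(var_t, var_t_minus_s, corr):
    """Variance of y_t - y_{t-s}, following equation (1)."""
    if not -1.0 <= corr <= 1.0:
        raise ValueError("corr must lie in [-1, 1]")
    return var_t + var_t_minus_s - 2.0 * math.sqrt(var_t * var_t_minus_s) * corr


# Hypothetical inputs: equal level variances of 1.0 at both periods.
independent = variance_of_change(1.0, 1.0, 0.0)  # no overlap: sum of the variances
overlapping = variance_of_change(1.0, 1.0, 0.8)  # assumed correlation of 0.8
print(independent, overlapping)
```

With independent samples the variance of the change is twice the variance of the level, and any positive correlation brings it below that benchmark.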
For the very simple situation of a stable population, that is no births and deaths, and simple random sampling with negligible sampling fractions,
$$\mathrm{Corr}(y_t, y_{t-s}) = \sqrt{k_t\, k_{t-s}}\; R_{t,t-s},$$
where $R_{t,t-s}$ is the individual level correlation and $k_t$ is the proportion of the sample at time $t$ that is common between periods $t$ and $t-s$. Many rotation patterns are set up so that $k_t = k_{t-s} = k(s)$, so that $\mathrm{Corr}(y_t, y_{t-s}) = k(s)\, R_{t,t-s}$. If the patterns of change do not vary over time, then $R_{t,t-s} = R(s)$. Usually we would expect the unit level correlation to be positive. More complex models for the correlation of the sampling errors are considered in section III.
These results suggest that the higher the sample overlap the higher the correlation between the estimates, and this leads to rotation designs with high sample overlap between periods for which the change is of interest. So for the analysis of one-period changes, high overlap between adjacent periods is desirable.
Averaging of estimates can be used to produce more stable estimates when the original estimates have high sampling variances, for example for small sub-groups or domains in the population. A particular case is estimates for small geographic areas. Averaging over time changes the length of the time period to which the estimate refers, which will hide any variation within the period over which the average is calculated. If the interest is in averages then positive correlations increase the variance. For example the average of three consecutive months would have variance
Publication date: 2008